The SOM map constructed from the 3-mer data of 58,897 SARS-CoV2 sequences
from four countries. The letters a, b, c and d were used to stand for USA, India,
[...] Brazil, respectively. They were coloured in red, blue, green and black in this [...]
In order to examine whether the SOM map truly reflected the data
[...] of the sequences, the fit of the SOM neuron occupancy
percentages to the percentage distribution of the sequences from different
countries in the whole data was tested. Table 7.22 shows the test result.
The Chi-square test p value for the two sets of percentages was 0.996, meaning that
no significant difference was detected and the two percentage distributions
were almost identical. Therefore, the SOM map truly reflected the genomic
deviation of these sequences from four countries. In addition, 53 neurons
were mixed, that is, occupied by sequences from more than one country.
In total, 1,355 sequences (2.3%) were mapped onto these neurons. This low
error rate also implied that this SOM map has well-[...] the original
genomic pattern or structure hidden in these sequences from four countries.
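
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not give the exact test procedure, so the sketch assumes a goodness-of-fit Chi-square statistic computed directly on the two sets of percentages with (number of countries - 1) degrees of freedom. The percentages below are hypothetical placeholders, not the values in Table 7.22, and the function names `chi2_sf` and `compare_percentages` are introduced here for illustration. Only the counts 1,355 and 58,897 come from the text.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total


def compare_percentages(occupancy_pct, sequence_pct):
    """Chi-square statistic and p value for occupancy vs. sequence percentages."""
    stat = sum((o - e) ** 2 / e for o, e in zip(occupancy_pct, sequence_pct))
    df = len(sequence_pct) - 1
    return stat, chi2_sf(stat, df)


# Hypothetical percentages for four countries (NOT taken from Table 7.22).
sequence_pct = [40.0, 25.0, 20.0, 15.0]   # share of sequences per country
occupancy_pct = [39.0, 26.0, 20.5, 14.5]  # share of occupied neurons per country

stat, p_value = compare_percentages(occupancy_pct, sequence_pct)
print(f"chi-square = {stat:.4f}, p value = {p_value:.4f}")

# Share of sequences mapped onto mixed neurons, using the counts in the text.
mixed_rate = 1355 / 58897
print(f"mixed-neuron rate = {mixed_rate:.1%}")
```

A p value close to 1 indicates that the two percentage distributions are nearly indistinguishable. Note that a statistic computed on percentages does not scale with the sample size, so the same test applied to raw counts would generally give a different p value.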